Bitmap Join Indexes vs. Data Partitioning

Author

  • Ladjel Bellatreche
Abstract

Introduction

Scientific databases and data warehouses store large amounts of data with several tables and attributes. For instance, the Sloan Digital Sky Survey (SDSS) astronomical database contains a large number of tables with hundreds of attributes, which can be queried in various combinations (Papadomanolakis & Ailamaki, 2004). These queries involve many tables and use binary operations such as joins. To speed up these queries, many optimization structures have been proposed. They can be divided into two main categories: redundant structures, such as materialized views and advanced indexing schemes (bitmap, bitmap join indexes, […]

These optimization techniques are used either sequentially or in combination. Combinations are made within a category (intra-structure): materialized views and indexes for the redundant category, and partitioning and parallel data processing for the non-redundant one. Materialized views and indexes compete for the same resource, storage, and incur maintenance overhead in the presence of updates (Sanjay, Chaudhuri & Narasayya, 2000). No existing work addresses the problem of selecting combined optimization structures. In this paper, we propose two approaches: one combines a non-redundant structure (horizontal partitioning) with a redundant structure (bitmap indexes) in order to reduce query processing cost and maintenance overhead; the other exploits vertical partitioning algorithms to generate bitmap join indexes. To facilitate the understanding of our approaches, we review these techniques in detail.

Data partitioning is an important aspect of physical database design. In the context of relational data warehouses, it allows tables, indexes and materialized views to be partitioned into disjoint sets of rows and columns that are physically stored and accessed separately (Sanjay, Narasayya & Yang, 2004). It has a significant impact on query performance and on the manageability of data warehouses.

Two types of data partitioning are available: vertical and horizontal. Vertical partitioning of a table T splits it into two or more tables, called sub-tables or vertical fragments, each of which contains a subset of the columns in T. Since many queries access only a small subset of the columns in a table, vertical partitioning can reduce the amount of data that needs to be scanned to answer a query. Note that the key columns are duplicated in each vertical fragment to allow "reconstruction" of an original row in T. Unlike horizontal partitioning, indexes or materialized views, in most of today's commercial database systems there is no native Data Definition Language (DDL) support for …
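
The vertical partitioning described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm. The table contents, the column names, and the helpers `vertical_partition` and `reconstruct` are hypothetical, chosen only to show that each fragment carries the key column and that original rows can be rebuilt by joining fragments on that key.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def vertical_partition(
    table: List[Row], key: str, column_groups: Sequence[Sequence[str]]
) -> List[List[Row]]:
    """Split `table` into one fragment per column group.

    The key column is duplicated in every fragment so that rows can be
    reconstructed later.
    """
    fragments = []
    for group in column_groups:
        cols = [key] + [c for c in group if c != key]
        fragments.append([{c: row[c] for c in cols} for row in table])
    return fragments


def reconstruct(fragments: List[List[Row]], key: str) -> List[Row]:
    """Rebuild the original rows by joining all fragments on the key."""
    merged: Dict[object, Row] = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Hypothetical table T (assumed data, for illustration only).
T = [
    {"id": 1, "name": "star-a", "ra": 10.5, "dec": -3.2, "magnitude": 14.1},
    {"id": 2, "name": "star-b", "ra": 11.0, "dec": 4.7, "magnitude": 12.9},
]

fragments = vertical_partition(T, "id", [["name"], ["ra", "dec", "magnitude"]])
names_fragment, position_fragment = fragments

# A query that needs only `name` scans the narrow fragment, not all of T.
names = [row["name"] for row in names_fragment]

# Joining the fragments on the duplicated key restores the original rows.
restored = reconstruct(fragments, "id")
```

In this sketch a query touching only `name` reads two columns per row instead of five, which is the scan reduction the text refers to. The cost is the duplicated key and the join needed whenever a query spans fragments.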


Related resources

Bringing Together Partitioning, Materialized Views and Indexes to Optimize Performance of Relational Data Warehouses

There has been a lot of work to optimize the performance of relational data warehouses. Three major techniques can be used for this objective: enhanced index schemes (join indexes, bitmap indexes), materialized views, and data partitioning. The existing research prototypes or products use materialized views alone, indexes alone, or a combination of them, but none of the prototypes use all three...


Automatic Selection of Bitmap Join Indexes in Data Warehouses

The queries defined on data warehouses are complex and use several join operations that induce an expensive computational cost. This cost becomes even more prohibitive when queries access very large volumes of data. To improve response time, data warehouse administrators generally use indexing techniques such as star join indexes or bitmap join indexes. This task is nevertheless complex and fas...


The Dimension-Join: A New Index for Data Warehouses

There are several auxiliary pre-computed access structures that allow faster answers by reading less base data. Examples are materialized views, join indexes, B-tree and bitmap indexes. This paper proposes dimension-join, a new type of index especially suited for data warehouses. The dimension-join borrows ideas from several concepts. It is a bitmap index, it is a multi-table join and when bein...


Yet Another Algorithms for Selecting Bitmap Join Indexes

One of the fundamental tasks that a data warehouse (DW) administrator needs to perform during physical design is to select the right indexes to speed up her/his queries. Two categories of indexes are available and supported by the main DBMS vendors: (i) indexes defined on a single table and (ii) indexes defined on multiple tables such as join indexes, bitmap join indexes, etc. Selecting relev...


Massive-Scale RDF Processing Using Compressed Bitmap Indexes

The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL quer...




Publication date: 2009